Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning


Abstract

This paper investigates the stochastic optimization problem, focusing on developing scalable parallel algorithms for deep learning tasks. Our solution involves a reformation of the objective function for neural network models, along with a novel computing strategy, coined weighted aggregating stochastic gradient descent (WASGD). Following a theoretical analysis of the characteristics of the new objective function, the method introduces a decentralized scheme based on the performance of local workers. Without any center variable, the method automatically gauges the importance of local workers and accepts their updates according to their contributions. Furthermore, we have developed an enhanced version of the method, WASGD+, by (1) implementing a designed sample order and (2) upgrading the weight evaluation function. To validate the new method, we benchmark our pipeline against several popular techniques, including state-of-the-art classifier training methods (e.g., elastic averaging SGD). Comprehensive validation studies have been conducted on four classic datasets: CIFAR-100, CIFAR-10, Fashion-MNIST, and MNIST. The results firmly validate the superiority of the WASGD scheme in accelerating training. Better still, the enhanced version, WASGD+, is shown to be a significant improvement over its prototype.
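The core idea described in the abstract, decentralized aggregation of workers weighted by their performance rather than by a central parameter server, can be illustrated with a toy sketch. Everything here (the least-squares objective, the `exp(-beta * loss)` weight function, and all names) is an illustrative assumption, not the paper's actual formulation:

```python
import numpy as np

def local_sgd_step(w, X, y, lr=0.1):
    """One gradient step on a local least-squares objective 0.5*mean((Xw - y)^2)."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def local_loss(w, X, y):
    return 0.5 * np.mean((X @ w - y) ** 2)

def weighted_aggregate(workers, losses, beta=1.0):
    """Weight each worker by exp(-beta * loss): better-performing workers count more.
    This softmax-style weight is an assumption for illustration."""
    weights = np.exp(-beta * np.asarray(losses))
    weights /= weights.sum()
    return sum(wt * w for wt, w in zip(weights, workers))

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

# Four workers, each with its own data shard and its own parameter copy.
shards = [(X[i::4], y[i::4]) for i in range(4)]
workers = [rng.normal(size=3) for _ in range(4)]

for _ in range(200):
    workers = [local_sgd_step(w, Xs, ys) for w, (Xs, ys) in zip(workers, shards)]
    losses = [local_loss(w, Xs, ys) for w, (Xs, ys) in zip(workers, shards)]
    w_avg = weighted_aggregate(workers, losses)
    workers = [w_avg.copy() for _ in workers]  # decentralized consensus step

print(np.round(w_avg, 2))
```

The point of the sketch is the absence of a center variable: the consensus parameter is recomputed each round purely from the workers' own states and their measured local performance.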


Similar Articles

Conflict Graphs for Parallel Stochastic Gradient Descent

We present various methods for inducing a conflict graph in order to effectively parallelize Pegasos. Pegasos is a stochastic sub-gradient descent algorithm for solving the Support Vector Machine (SVM) optimization problem [3]. In particular, we introduce a binary tree-based conflict graph that matches the convergence of a well-known parallel implementation of stochastic gradient descent, known as HOG...
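For context, the algorithm being parallelized here is the Pegasos sub-gradient step for a linear SVM. Below is a minimal single-threaded sketch of that step, with illustrative synthetic data and an assumed regularization constant; it omits the optional projection step:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 400, 2
X = rng.normal(size=(n, d))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)  # linearly separable toy labels

lam = 0.01  # regularization constant (an assumption)
w = np.zeros(d)
for t in range(1, 5001):
    i = rng.integers(n)                 # pick one example uniformly at random
    eta = 1.0 / (lam * t)               # Pegasos step size schedule
    # Sub-gradient of lam/2*||w||^2 + hinge loss on the sampled example:
    if y[i] * (X[i] @ w) < 1:           # margin violated -> hinge term is active
        w = (1 - eta * lam) * w + eta * y[i] * X[i]
    else:
        w = (1 - eta * lam) * w

acc = np.mean(np.sign(X @ w) == y)
print(round(acc, 2))
```

The conflict-graph idea in the cited work concerns which of these per-example updates can safely run in parallel; the sequential step itself is what is shown above.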


Distributed Deep Learning Using Synchronous Stochastic Gradient Descent

We design and implement a distributed multi-node synchronous SGD algorithm, without altering hyperparameters, compressing data, or altering algorithmic behavior. We perform a detailed analysis of scaling, and identify optimal design points for different networks. We demonstrate scaling of CNNs on hundreds of nodes, and present what we believe to be record training throughputs. A 512 minibatch VGG...
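The "without altering algorithmic behavior" property of synchronous data-parallel SGD comes from averaging gradients across all workers before every update, so each step matches what a single machine would compute on the full batch. A minimal sketch under assumed toy data (the averaging line stands in for an allreduce in a real multi-node setting):

```python
import numpy as np

def shard_gradient(w, X, y):
    """Gradient of 0.5*mean((Xw - y)^2) on one worker's data shard."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(1)
X = rng.normal(size=(128, 4))
w_true = rng.normal(size=4)
y = X @ w_true

n_workers = 8
shards = [(X[i::n_workers], y[i::n_workers]) for i in range(n_workers)]

w = np.zeros(4)
lr = 0.1
for _ in range(300):
    grads = [shard_gradient(w, Xs, ys) for Xs, ys in shards]
    g = np.mean(grads, axis=0)   # synchronous allreduce: average across workers
    w -= lr * g                  # every worker applies the identical update

print(np.round(w, 3))
```

Because every worker sees the same averaged gradient, all replicas of `w` stay bit-identical, which is why no hyperparameter retuning is needed as the node count grows.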


Batched Stochastic Gradient Descent with Weighted Sampling

We analyze a batched variant of Stochastic Gradient Descent (SGD) with weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the convergence rate is provably possible compared to either batched sampling or weighted sampling alone. We propose several computationally efficient schemes to app...
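The key mechanism in weighted sampling is drawing examples from a non-uniform proposal distribution and rescaling each sampled gradient by `1/(n * p_i)` so the batch gradient remains unbiased. The proposal (squared row norms), step size, and data below are assumptions for illustration, not the scheme analyzed in the cited work:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 3
X = rng.normal(size=(n, d)) * rng.uniform(0.1, 3.0, size=(n, 1))  # heterogeneous rows
w_true = np.array([0.5, -1.0, 2.0])
y = X @ w_true

p = np.linalg.norm(X, axis=1) ** 2
p /= p.sum()               # sample "harder" (larger-norm) examples more often

w = np.zeros(d)
lr = 0.05
batch = 16
for _ in range(2000):
    idx = rng.choice(n, size=batch, p=p)
    resid = X[idx] @ w - y[idx]
    # Importance-weight each term by 1/(n * p_i) so E[g] equals the full gradient.
    g = (X[idx] * (resid / (n * p[idx]))[:, None]).sum(axis=0) / batch
    w -= lr * g

print(np.round(w, 2))
```

Unbiasedness follows directly: the expectation of `x_i * resid_i / (n * p_i)` under sampling probability `p_i` is the uniform average `(1/n) * sum_i x_i * resid_i`, i.e., the full-batch gradient.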


Annealed Gradient Descent for Deep Learning

Stochastic gradient descent (SGD) has been regarded as a successful optimization algorithm in machine learning. In this paper, we propose a novel annealed gradient descent (AGD) method for non-convex optimization in deep learning. AGD optimizes a sequence of gradually improved smoother mosaic functions that approximate the original non-convex objective function according to an annealing schedul...
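The annealing idea, optimizing a sequence of progressively less-smoothed approximations of a non-convex objective, can be sketched in one dimension with Gaussian smoothing and a Monte-Carlo gradient estimate. The target function, schedule, and estimator below are illustrative assumptions, not the mosaic construction of the cited paper:

```python
import numpy as np

def f(x):
    """Non-convex toy objective; global minimum near x ≈ -0.47."""
    return x**2 + 2.0 * np.sin(3.0 * x)

def smoothed_grad(x, sigma, rng, k=256):
    """Monte-Carlo central-difference gradient of E[f(x + sigma*z)], z ~ N(0,1)."""
    z = rng.normal(size=k)
    h = 1e-4
    return np.mean((f(x + sigma * z + h) - f(x + sigma * z - h)) / (2 * h))

rng = np.random.default_rng(5)
x = 3.0                                  # start far from the global optimum
for sigma in [2.0, 1.0, 0.5, 0.1, 0.0]:  # anneal smoothing width toward zero
    for _ in range(200):
        x -= 0.05 * smoothed_grad(x, sigma, rng)

print(round(x, 2))
```

Early stages see an almost-quadratic smoothed landscape (the oscillatory term is damped by smoothing), guiding the iterate into the right basin before the final unsmoothed refinement.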


Asynchronous Decentralized Parallel Stochastic Gradient Descent

Recent work shows that decentralized parallel stochastic gradient descent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technology to improve the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning software and solvers based on centrali...
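A synchronous D-PSGD-style round can be sketched as follows: workers on a ring average parameters with their immediate neighbors (no parameter server), then take a local gradient step. The ring topology, uniform 1/3 mixing weights, and toy objective are assumptions; the asynchronous variant in the cited work additionally removes the lockstep rounds shown here:

```python
import numpy as np

def grad(w, X, y):
    """Local gradient of 0.5*mean((Xw - y)^2) on one worker's shard."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(3)
X = rng.normal(size=(120, 3))
w_true = np.array([2.0, 0.0, -1.0])
y = X @ w_true

k = 6                                   # workers arranged on a ring
shards = [(X[i::k], y[i::k]) for i in range(k)]
ws = [rng.normal(size=3) for _ in range(k)]

lr = 0.1
for _ in range(400):
    # Gossip step: average with the two ring neighbors only.
    mixed = [(ws[(i - 1) % k] + ws[i] + ws[(i + 1) % k]) / 3 for i in range(k)]
    # Local SGD step on each worker's own shard.
    ws = [m - lr * grad(m, Xs, ys) for m, (Xs, ys) in zip(mixed, shards)]

consensus = np.mean(ws, axis=0)
print(np.round(consensus, 2))
```

Each round costs only neighbor-to-neighbor communication, which is why decentralized schemes avoid the bandwidth bottleneck of a central node.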



Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2022

ISSN: 1558-2191, 1041-4347, 2326-3865

DOI: https://doi.org/10.1109/tkde.2020.3047894